Multimodal Integration for Meeting Group Action Segmentation and Recognition

نویسندگان

  • Marc Al-Hames
  • Alfred Dielmann
  • Daniel Gatica-Perez
  • Stephan Reiter
  • Steve Renals
  • Gerhard Rigoll
  • Dong Zhang
چکیده

We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, and can be used either as input for a meeting browser or as a first step towards a higher semantic analysis of the meeting. A common lexicon of multimodal group meeting actions, a shared meeting data set, and a common evaluation procedure enable us to compare the different approaches. We compare three different multimodal feature sets and four modelling infrastructures: a higher semantic feature approach, multi-layer HMMs, a multistream DBN, as well as a multi-stream mixed-state DBN for disturbed data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic recognition of multiparty human interactions using dynamic Bayesian networks

Relating statistical machine learning approaches to the automatic analysis of multiparty communicative events, such as meetings, is an ambitious research area. We have investigated automatic meeting segmentation both in terms of “Meeting Actions” and “Dialogue Acts”. Dialogue acts model the discourse structure at a fine grained level highlighting individual speaker intentions. Group meeting act...

متن کامل

Multimodal Meeting Tracker

Face-to-face meetings usually encompass several modalities including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modalities is important to create an accurate record of a meeting. However, each of these modalities presents recognition difficulties. Speech recognition must be speaker and domain independent, have low word error rates, and ...

متن کامل

Towards a Multimodal Meeting Record

Face-to-face meetings usually encompass several modalities including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modalities is important to create an accurate record of a meeting. However, each of these modalities presents recognition difficulties. Speech recognition must be speaker and domain independent, have low word error rates, and ...

متن کامل

Automatic analysis of multiparty meetings

This paper is about the recognition and interpretation of multiparty meetings captured as audio, video and other signals. This is a challenging task since the meetings consist of spontaneous and conversational interactions between a number of participants: it is a multimodal, multiparty, multistream problem. We discuss the capture and annotation of the AMI meeting corpus, the development of a m...

متن کامل

Interaction-based Information Retrieval in Multimodal, Online, Artefact-Focused Meeting Recordings

Traditional search operations, topic detection or summarisation of meeting recordings are generally performed using segmentation and indexing techniques originally developed in the field of Multimedia Information Retrieval. However, many assumptions possible in a media production environment are often inadequate when applied to spontaneous meeting recordings, as features such as sudden changes ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005